Multi-channel Encoder for Neural Machine Translation

Authors

Hao Xiong

Zhongjun He

Xiaoguang Hu

Hua Wu

Abstract

The attention-based encoder-decoder is an effective architecture for neural machine translation (NMT). It typically relies on recurrent neural networks (RNNs) to build the blocks that are later called by the attentive reader during decoding. This encoder design yields a relatively uniform composition of the source sentence, despite the gating mechanism employed in the encoding RNN. On the other hand, we often want the decoder to take pieces of the source sentence at varying levels of composition that suit its own linguistic structure: for example, we may want to take an entity name in its raw form while taking an idiom as a perfectly composed unit. Motivated by this demand, we propose the Multi-channel Encoder (MCE), which enhances the encoding components with different levels of composition. More specifically, in addition to the hidden state of the encoding RNN, MCE takes 1) the original word embedding for raw encoding with no composition, and 2) a particular design of external memory from the Neural Turing Machine (NTM) for more complex composition, while all three encoding strategies are properly blended during decoding. An empirical study on Chinese-English translation shows that our model improves by 6.52 BLEU points over a strong open-source NMT system, DL4MT. On the WMT14 English-French task, our single shallow system achieves BLEU=38.8, comparable with state-of-the-art deep models.
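The abstract says the three encodings are blended during decoding but does not give the blending rule. The following is a minimal sketch of one possible way to combine the three channels at a single source position, assuming a softmax-normalised gate over per-channel scores. The function names `softmax` and `blend_channels`, the `gate_scores` parameter, and the gate form itself are illustrative assumptions, not the authors' formulation. In a trained model the scores would presumably be learned rather than passed in by hand.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def blend_channels(
    embedding: Sequence[float],
    rnn_state: Sequence[float],
    memory_read: Sequence[float],
    gate_scores: Sequence[float],
) -> List[float]:
    """Blend three encoder channels for one source position.

    ASSUMPTION: a softmax gate over three scalar scores. The abstract only
    states that the channels are blended, not how.
    """
    channels = (embedding, rnn_state, memory_read)
    dim = len(embedding)
    if any(len(c) != dim for c in channels):
        raise ValueError("all channels must share the same dimensionality")
    if len(gate_scores) != len(channels):
        raise ValueError("expected one gate score per channel")

    weights = softmax(gate_scores)
    # Weighted sum of the channels, computed per dimension.
    return [
        sum(w * c[i] for w, c in zip(weights, channels))
        for i in range(dim)
    ]


if __name__ == "__main__":
    # A high score on the first channel keeps the raw embedding almost
    # unchanged, as one might want for an entity name.
    print(blend_channels([1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [10.0, 0.0, 0.0]))
```

With equal scores the three channels contribute equally. Pushing one score much higher than the others makes the blend approach that channel alone, which mirrors the intuition of passing an entity name through in raw form while relying on more composed channels for an idiom.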


Similar resources

DCU System Report on the WMT 2017 Multi-modal Machine Translation Task

We report experiments with multi-modal neural machine translation models that incorporate global visual features in different parts of the encoder and decoder, and use the VGG19 network to extract features for all images. In our experiments, we explore both different strategies to include global image features and also how ensembling different models at inference time impact translations. Our s...


Incorporating Global Visual Features into Attention-based Neural Machine Translation

We introduce multi-modal, attention-based Neural Machine Translation (NMT) models which incorporate visual features into different parts of both the encoder and the decoder. Global image features are extracted using a pre-trained convolutional neural network and are incorporated (i) as words in the source sentence, (ii) to initialise the encoder hidden state, and (iii) as additional data to init...


Evaluating Discourse Phenomena in Neural Machine Translation

For machine translation to tackle discourse phenomena, models must have access to extrasentential linguistic context. There has been recent interest in modelling context in neural machine translation (NMT), but models have been principally evaluated with standard automatic metrics, poorly adapted to evaluating discourse phenomena. In this article, we present hand-crafted, discourse test sets, d...


Exploiting Source-side Monolingual Data in Neural Machine Translation

Neural Machine Translation (NMT) based on the encoder-decoder architecture has recently become a new paradigm. Researchers have proven that the target-side monolingual data can greatly enhance the decoder model of NMT. However, the source-side monolingual data is not fully explored although it should be useful to strengthen the encoder model of NMT, especially when the parallel corpus is far fr...


Bridging Source and Target Word Embeddings for Neural Machine Translation

Neural machine translation systems encode a source sequence into a vector from which a target sequence is generated via a decoder. Different from the traditional statistical machine translation, source and target words are not directly mapped to each other in translation rules. They are at the two ends of a long information channel in the encoder-decoder neural network, separated by source and ...



Journal:

CoRR

Volume: abs/1712.02109

Pages: -

Publication date: 2017
